perm filename CBCL[F75,JMC]8 blob sn#610454 filedate 1981-09-04 generic text, type C, neo UTF8
@make(article)
@style(linewidth 6 inches)
@style(indent 0,spacing 1.5,spread 0.5)
@style(topmargin 0.5 inches,bottommargin 0.2 inches,leftmargin 1.25 inches)
@pageheading()

@begin(heading)
THE COMMON BUSINESS COMMUNICATION LANGUAGE
@end(heading)

@center(John McCarthy)

	Here are some  ideas about the  value of a  @p(common business
communication   language)  (CBCL   for   short)   and   what   its
characteristics might be.  Besides its practical significance, CBCL
raises issues concerning the semantics of natural language.

	The need  for  such a  language was  suggested  to me  by  an
article by Paul  Baran that appeared in @p(Public  Interest) in 1965.
In  this article, Baran  envisaged a  world of the  future in which
companies would be well  equipped with on-line computer systems.
The inventory  control computer of  company A
would write on  the screen of  a clerk  in the  purchasing
department a statement that 1000 gross  of such-and-such pencils were
needed and  that they should be purchased from  company B.  The clerk
would turn  to her  typewriter and  type out  a purchase  order.   At
company B another clerk would receive  the purchase order and turn to
her  terminal and  tell the  computer to  arrange to ship  the pencils.
Eliminating both clerks by  having the
computers  speak directly  to  each other  was not  mentioned.
Perhaps the author felt  that he was already straining  the
credulity of his audience.

	Suppose we  wish to eliminate the  clerks by
having  the computers speak directly  to each other.   @b(What are the
requirements?)

	First, computers do communicate directly now.  In the late
1950s the Social Security Administration announced a format for
IBM  seven channel magnetic tape on which  it was prepared to receive
reports of earnings and payroll deductions.  Note the limitations:
(i) Magnetic tapes are mailed rather than sent by direct electronic
communication, admittedly entirely appropriate in this case.  (ii) A
single fixed kind of message with a fixed  set of parameters for each
report.    (iii)  There is only one receiver of information which can
dictate the format.  Today information is often exchanged electronically
among computer systems belonging to different organizations.  This is
usually by specific treaty between the two organizations, but sometimes
a group that will be communicating has agreed on formats.
An example is the U.S. Navy's system for exchanging  information
among ships about what their radars and other sensors can see so that
each  ship can  have the  full  radar picture  acquired by  the whole
fleet.  In connection  with the extension of  the system to NATO,  it
was  completely  redesigned,  and  on a  designated  day,  all  users
switched to the new system.

	Our goal is more ambitious in the following respects:

@begin(enumerate)
A common language is to be adopted that can express
business communications, for example, requests for price quotations,
offers to buy and sell, queries about delivery times and places,
inquiries about the status of delayed orders, and references to standard
commercial legal agreements.  If possible, the same language, with only
different primitives, should suffice to communicate the Navy's
or the FAA's radar information or a request from one state's department
of motor vehicles to another's for a list of a person's traffic
convictions.

Any organization should  be able to communicate with  any other
without pre-arrangement  over ordinary dial-up telephone connections.
Of course, this requires  authentication procedures and  verification
of authorization procedures,  but let us not be  unduly distracted by
the security aspects of computing lest we end up with a secure method
of communication and nothing to say.

The  system  should  be open  ended  so that  as  programs
improve, programs that  can at first only order  by stock numbers can
later be programmed  to inquire about  specifications and prices  and
decide  on  the best  deal.    This  requires that  each  message  be
translatable into  a human-comprehensible form and that each computer
have a  way  of  referring  messages  it is  not  yet  programmed  to
understand to humans.   When a new type of message  is to displace an
old one,  the programs should send both
until all the receivers can understand the
new form.   Thus the  crises of cutover days, as  in the naval
example, could be eliminated.

CBCL is strictly a communication protocol.  It should not
presuppose any data-base format for the storage within machines of
the information communicated, and it should not presuppose anything
about the programs that use the language.  Each business using the
language would have a program designed to use the particular part
of CBCL relevant to its business communications.  Thus CBCL presupposes
nothing about the programs that decide when to order or what orders
to accept.

CBCL is not concerned with the low-level  aspects of  the
message formats,  i.e., what kinds  of bit streams  and what  kinds of
modems, except to  remark that the system should avoid traps in these
areas,  and  the  users  should  be  able  to  change  their  systems
asynchronously.  Presumably CBCL would use the same low-level protocols
used for simpler inter-business communications like person-to-person
messages and file transfer.
@end(enumerate)
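
	The open-ended behavior asked for in the third point might be sketched as follows. This is only an illustration under stated assumptions: a message is taken to be a nested list whose first item names its type, and the message types ORDER, ORDER-V2 and INQUIRE-SPECS, along with every function and variable name, are hypothetical. A program that does not yet understand a message renders it readably and refers it to a person, and a sender in transition can transmit the old and new forms together.

```python
# Hypothetical sketch; none of these names come from the CBCL proposal itself.

human_queue = []  # readable copies of messages referred to a person


def render(item):
    """Turn a nested-list message into readable parenthesized text."""
    if isinstance(item, list):
        return "(" + " ".join(render(x) for x in item) + ")"
    return str(item)


def handle_order(message):
    """Stand-in for whatever the business does with an order."""
    return "order accepted: " + render(message[1])


# This program so far understands only one (assumed) message type.
HANDLERS = {"ORDER": handle_order}


def receive(message):
    """Dispatch on the lead item; refer unknown message types to a human."""
    handler = HANDLERS.get(message[0])
    if handler is None:
        human_queue.append(render(message))
        return None
    return handler(message)


def receive_any(forms):
    """Given old and new forms of one message, act on the first one understood."""
    for message in forms:
        if message[0] in HANDLERS:
            return receive(message)
    # None of the forms is understood: refer all of them to a human.
    human_queue.append(" | ".join(render(m) for m in forms))
    return None
```

Because the receiver skips forms it cannot interpret, a sender can introduce ORDER-V2 alongside ORDER and drop the old form only when every receiver handles the new one, which is what removes the need for a cutover day.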

	We do not have a final proposal but here are some ideas:
@begin(enumerate)
The messages are lists of items punctuated by parentheses.
The lead item of each list identifies the type of message and is used
to determine  how to interpret  the rest.   The  items may be  either
sublists or atoms.  If  an item is a sublist, its first element tells
how to interpret it.   Atoms are  binary numbers of say  32 bits.   A
dictionary tells what each means.  Other forms of data
may   be  used  provided  they   are  demarcated  by  appropriate
punctuation and provided they are pointed at from lists that tell how
they are to be interpreted.

Here are some examples:

@begin(enumerate)
(REQUEST-QUOTE (YOUR-STOCK-NUMBER A7305) (UNITS 100))

(REQUEST-QUOTE (PENCILS #2) (GROSS 100))

(REQUEST-QUOTE  (ADJECTIVE  (PENCILS #2)  YELLOW)  (GROSS
100))

(WE-QUOTE   (OUR-STOCK-NUMBER   A7305)   (QUANTITY  100)
(DELIVERY-DATE 3-10-77) (PRICE @$1.00))

(PLEASE-SAY (IOTA (X) (AND (RED X) (PENCIL X))))
@end(enumerate)@end(enumerate)
It  appears  that  some  items  may  require  a  variable  number  of
modifiers.
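
	As a rough illustration of the list format, here is a minimal reader for messages written as in the examples above. It is a sketch, not part of the proposal: the function names are invented, and atoms are kept as printed names rather than the 32-bit numbers that a dictionary would map them to. Since each sublist is read to its closing parenthesis, an item can carry any number of modifiers.

```python
# Hypothetical sketch: read a parenthesized message into nested Python lists.

def tokenize(text):
    """Split a message into parentheses and atoms."""
    return text.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Consume one item (an atom or a list) from the front of the token list."""
    if not tokens:
        raise ValueError("unexpected end of message")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(parse(tokens))
        if not tokens:
            raise ValueError("missing closing parenthesis")
        tokens.pop(0)  # discard the ")"
        return items
    if token == ")":
        raise ValueError("unexpected closing parenthesis")
    return token


def read_message(text):
    """Parse exactly one message and reject anything left over."""
    tokens = tokenize(text)
    message = parse(tokens)
    if tokens:
        raise ValueError("trailing text after message")
    return message


def fields(message):
    """Map the lead item of each sublist to the rest of that sublist."""
    return {item[0]: item[1:] for item in message[1:] if isinstance(item, list)}


msg = read_message("(REQUEST-QUOTE (YOUR-STOCK-NUMBER A7305) (UNITS 100))")
message_type = msg[0]   # the lead item says how to interpret the rest
details = fields(msg)   # sublists keyed by their own lead items
```

A receiving program would look up `message_type` to choose its interpretation, then consult `details` for the parts it understands.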

	As  a toy  example,  imagine writing  conventions  that would
permit any Monopoly-like game to  be played by independently  written
programs.  Suppose that  the moves are communicated to  a referee who
receives requests to roll the dice and returns information about what
squares the pieces landed on and what "chance" cards were drawn.  The
programs would  communicate offers to  buy and sell directly  to each
other and to the "banker".

	CBCL should satisfy an important principle enunciated by Chomsky
in his @b(Reflections on Language) as a characteristic of human language.
The principle (reworded considerably) is that no grammatical position
should require an identifier or a number per se but should allow a phrase.
For example, instead of requiring a stock number, CBCL should accept an
expression designating the stock number, such as "the same stock number
as last week" or "the new stock number of the item that was formerly
stock number 2531".  We don't really mean these English phrases but
rather whatever they translate into.
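
	One way to picture this principle is a receiver that accepts, in the stock-number position, either a literal or a list that designates one. The sketch below is an assumption-laden illustration: the operators SAME-AS-LAST-WEEK and NEW-NUMBER-OF are invented stand-ins for whatever the English phrases would translate into, and both lookup tables contain made-up data.

```python
# Hypothetical sketch; operator names and table contents are invented.

LAST_WEEK_ORDER = {"stock-number": "2531"}  # assumed record of last week's order
RENUMBERED = {"2531": "A7305"}              # assumed old -> new stock numbers


def designate(expr):
    """Return the stock number that an atom or a phrase designates."""
    if not isinstance(expr, list):
        return expr  # a literal stock number designates itself
    operator, *args = expr
    if operator == "SAME-AS-LAST-WEEK":
        return LAST_WEEK_ORDER["stock-number"]
    if operator == "NEW-NUMBER-OF":
        # The argument is itself a designating expression, so phrases nest.
        return RENUMBERED[designate(args[0])]
    raise ValueError("cannot interpret " + str(operator))
```

Because `designate` calls itself on its arguments, a phrase may appear wherever a number may, including inside another phrase.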

@heading(CBCL and natural language)

	Developing an expressive CBCL has proved unexpectedly difficult.
Even concentrating on the idea of a purchase order doesn't easily
lead to defining formats that permit expressing everything that it should
be possible to include in a purchase order.  The problem is that every
aspect of the purchase order such as the delivery method or the terms of
payment seems to admit infinite variation and elaboration.  It is a
semantic feature of natural language that this elaboration is possible.
The problems do not at all stem from the rigid list syntax of CBCL,
which after all resembles the result of parsing a natural language
text.  The problems are in the semantics, i.e., in specifying what
should be expressible.

	This suggests that the problem of formalizing what is expressible
in natural language can and should be studied entirely separately from
the syntax.  In addition it suggests that putting natural language
front ends on computer programs often entirely misses the key problems
of natural language.  Namely, before the natural language front end
is attached, the programmer has already decided what things shall be
sayable, and they are usually things that can readily be said in a
pre-existing input-output system.  But if we are right, the most difficult
problems in making a computer use language involve deciding what is
to be sayable.

	For example, consider some possible specifications of the
method of delivery.

@begin(enumerate)
By air excluding Capital Airlines.

By air excluding Capital Airlines provided this doesn't
delay the shipment more than a day.

As soon as possible without incurring extra charges.

By truck complying with the rules on shipment of explosives
(even though the present shipment isn't classified as explosive).

By truck making sure our competitor doesn't learn the
size and model number of the item shipped.
@end(enumerate)

Unfortunately, these few examples do not show the scope of the problem.

Development of CBCL and implementation of programs using it are being
studied in collaboration with the Artificial Intelligence Center of
SRI International.